May 23, 2021
Materials at https://github.com/CEIDatUGA/ECOL8540-datavis

Aesthetics

Design Criteria

Consistency (-Heer)

The image properties (visual variables) should match the data properties. TRANSLATION: Don’t lie (even by omission).

Ordering (-Heer)

Encode the most important information in the most effective way.

Effectiveness (-Mackinlay)

The most effective visualization conveys information in the most perceivable way. (Can be decoded the fastest and most accurately.)

Expressiveness (-Mackinlay)

Data is expressible in a visual language if the signs express all the facts and only the facts in the data.

Expressiveness (-Mackinlay)

An encoding is not expressive when:

  • The visual language cannot express all the facts in the data.
  • The visual language expresses facts that are not in the data.

Tufte’s Principles

Integrity (Tufte)

  • graphics proportional to data
    (e.g. zero shown, etc.)
    • perceptually proportional
      (Perception is not always linear.)
    • avoid 3D / perspective
      due to increased ambiguity
  • use labels to resolve graphical ambiguity
    • label on the graph itself
    • label important events
  • data dimensions = graphical dimensions
    • Vary only one graphic element
      for each data dimension
  • don’t quote data out of context
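One way to put a number on "graphics proportional to data" is Tufte's lie factor: the size of the effect shown in the graphic divided by the size of the effect in the data. The sketch below is only an illustration and is not part of the original slides. The helper names (relative_change, lie_factor, bar_height) and the example values are hypothetical. It compares two bars drawn from a zero baseline with the same bars drawn on an axis truncated at 45.

```r
# Relative change from a first value to a second value
relative_change <- function(first, second) (second - first) / first

# Lie factor: effect size shown in the graphic / effect size in the data
lie_factor <- function(data_first, data_second, drawn_first, drawn_second) {
  relative_change(drawn_first, drawn_second) /
    relative_change(data_first, data_second)
}

# Hypothetical data values, drawn as bars rising from a given axis baseline
values <- c(50, 55)
bar_height <- function(value, baseline) value - baseline

# Bars drawn from zero: drawn heights are proportional to the data
lf_zero <- lie_factor(values[1], values[2],
                      bar_height(values[1], 0), bar_height(values[2], 0))

# Bars drawn from a baseline of 45: the 10% increase looks like a doubling
lf_truncated <- lie_factor(values[1], values[2],
                           bar_height(values[1], 45), bar_height(values[2], 45))

c(zero_baseline = lf_zero, truncated_axis = lf_truncated)
```

A lie factor of 1 means the graphic shows the effect at its true size. Here the truncated axis gives a lie factor of 10.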

Excellence (Tufte)

  • Clarity
  • Precision
  • Efficiency

Greatest number of ideas in the shortest time with the least ink in the smallest space

MISC.

  • Multiple y axes: don’t do it.
    (use panels instead)
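A minimal base R sketch of the panel alternative, assuming two variables with different units measured on the same days. The choice of the built-in airquality dataset and of the two variables is an illustration only. Each variable gets its own panel and its own y scale, and both panels share the same x values.

```r
# Built-in dataset: daily wind speed (mph) and temperature (degrees F); keep July only
aq <- airquality[airquality$Month == 7, ]

# Two stacked panels instead of one plot with two y axes
op <- par(mfrow = c(2, 1), mar = c(4, 4, 1, 1))
plot(aq$Day, aq$Wind, type = "l", xlab = "", ylab = "Wind (mph)")
plot(aq$Day, aq$Temp, type = "l", xlab = "Day of July", ylab = "Temperature (deg F)")
par(op)  # restore the previous graphics settings
```

In {ggplot2}, facet_wrap() and facet_grid() serve the same purpose.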

Tufte’s Principles

Data-Ink Ratio

\[ \begin{aligned} \text{Data-ink ratio} &= \frac{\text{data-ink}}{\text{total ink used to print the graphic}} \\[1ex] &= \text{proportion of a graphic's ink devoted to the} \\ &\phantom{{}={}} \text{non-redundant display of data-information} \\[1ex] &= 1 - (\text{proportion of a graphic's ink that can be erased} \\ &\phantom{{}={}} \text{without loss of data-information}) \end{aligned} \]

Data-Ink Ratio

  • Show the Data
  • Maximize the data-ink ratio (within reason)
  • Erase non-data ink (within reason)
  • Erase redundant data-ink (within reason)
  • Revise / Edit
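The steps in the list above can be sketched in base R. This is an illustration under assumptions, not the example from the slides: the built-in mtcars data and the specific elements erased (bar outlines, grid, frame, axis line) are choices made for this sketch.

```r
counts <- table(mtcars$cyl)  # number of cars per cylinder count (built-in data)

op <- par(mfrow = c(1, 2), mar = c(4, 4, 2, 1))

# Left: bar outlines, a background grid, and a heavy frame around the plot
barplot(counts, col = "grey60", border = "black",
        main = "More non-data ink", xlab = "Cylinders", ylab = "Cars")
grid(nx = NA, ny = NULL, lty = 1, col = "grey30")
box(lwd = 2)

# Right: same data with outlines, grid, frame, and axis line erased
barplot(counts, col = "grey60", border = NA, axes = FALSE,
        main = "Less non-data ink", xlab = "Cylinders", ylab = "Cars")
axis(2, las = 1, lwd = 0, lwd.ticks = 1)  # keep ticks and labels, drop the axis line

par(op)  # restore the previous graphics settings
```

The "within reason" caveat still applies: the tick labels stay because without them the bar heights could not be read.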

Data-Ink Ratio

[Figures labeled: bad, better, best]

Data-Ink Ratio

original - redundant = “the good part”

Tufte’s Principles

Chart Junk

Color

Perceptually uniform color scales

Theory:

http://datavis-sp16.github.io/lectures/color

Takeaway:

Use the HCL color space (Hue, Chroma, Luminance), a perceptually uniform color space

Lightness is the most important and most accurate perceptual channel
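A short base R sketch of why lightness matters, assuming only the grDevices functions that ship with R. The helper name lightness and the palette parameters are hypothetical choices for this illustration. It builds a sequential palette whose lightness falls in equal steps and compares it with rainbow(), whose lightness goes up and down as the hue changes.

```r
# CIE lightness (L) of a vector of colors, computed with base R's grDevices
lightness <- function(cols) {
  rgb01 <- t(col2rgb(cols)) / 255                      # one row per color, values in [0, 1]
  convertColor(rgb01, from = "sRGB", to = "Luv")[, 1]  # first column is L
}

# Sequential palette: fixed hue and chroma, lightness decreasing in equal steps
seq_pal <- hcl(h = 260, c = 20, l = seq(85, 35, length.out = 7))

# Base R rainbow palette for comparison
rainbow_pal <- rainbow(7)

round(lightness(seq_pal))      # decreases steadily from light to dark
round(lightness(rainbow_pal))  # rises and falls with hue
```

Because the rainbow palette's lightness is not monotonic, a reader cannot rank its colors from low to high without consulting the legend.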

R color tools:

  • Base R does not enforce good default colors
  • {ggplot2}, {plotly}, and many other R packages have good defaults.
  • The {colorspace} package lets you build HCL color palettes
    GUI: http://hclwizard.org/r-colorspace/ (can also be launched from R with colorspace::choose_palette())
  • The {datacolor} package makes it easier to work with HCL, and also analyzes color palettes
    https://github.com/allopole/datacolor
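If none of these packages is installed, recent versions of base R (3.6.0 and later) also ship named HCL palettes through grDevices::hcl.colors(). The sketch below is an addition to the list above, not something the slides cover. The palette name and length are arbitrary choices.

```r
# Names of some built-in sequential HCL palettes (requires R >= 3.6.0)
head(hcl.pals(type = "sequential"))

# Seven colors from the "Viridis" palette, returned as hex strings
pal <- hcl.colors(7, palette = "Viridis")
pal

# Quick visual check: draw the palette as a strip of colored cells
image(matrix(seq_along(pal)), col = pal, axes = FALSE)
```

The {colorspace} function sequential_hcl() accepts palette names in the same way.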

Problem: Color palette for a binned, continuous variable. (e.g. show absolute rainfall quantities AND “low,” “medium,” and “high” rainfall categories)

Palette from http://www.hclwizard.org/why-hcl/, recreated with the {datacolor} R package:

# install.packages("devtools") ## also installs the {remotes} package
# remotes::install_github("allopole/datacolor")
n <- 12 # palette length
# Lightness ramps continuously; chroma and hue change in steps (one step per bin)
why_hcl <- datacolor::hcl2hex(
  L = 100 * datacolor::rampx(from = .95, to = .35, n, exponent = 1.65),
  C = 100 * datacolor::stepx(from = .2, to = .77, n, step.n = n / 4),
  H = datacolor::stepx(from = 65, to = 320, n = n, step.n = n / 4)
  )
why_hcl
##  [1] "#FCEFD9" "#F8ECD6" "#F1E5CF" "#A6EBC9" "#9ADFBD" "#8CD1AF" "#70BBEA"
##  [8] "#5BAAD9" "#4298C6" "#C34FAC" "#AE3398" "#9A0084"
datacolor::colorbar(why_hcl)

datacolor::colorplot(why_hcl, colorblind = TRUE)
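The same idea can be approximated in base R without {datacolor}. This is a rough sketch under stated assumptions, not a reimplementation of rampx() or stepx(): it uses a linear lightness ramp where the code above uses an exponent of 1.65, so the colors will not match the output shown. The variable names are hypothetical. Hue and chroma are held constant within each of four bins, while lightness changes with every color.

```r
n <- 12               # palette length
bins <- 4             # number of bins (categories)
per_bin <- n / bins   # colors per bin

L <- seq(95, 35, length.out = n)                            # lightness: continuous ramp
C <- rep(seq(20, 77, length.out = bins), each = per_bin)    # chroma: one level per bin
H <- rep(seq(65, 320, length.out = bins), each = per_bin)   # hue: one level per bin

binned_pal <- hcl(h = H, c = C, l = L)

# Colors grouped by bin: hue marks the category, lightness the quantity
split(binned_pal, rep(seq_len(bins), each = per_bin))
```

Hue carries the category and lightness carries the quantity, so one palette can serve both readings of the variable.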

Readings

Kelleher, C. and Wagener, T., 2011. Ten guidelines for effective data visualization in scientific publications. Environmental Modelling & Software, 26(6), pp.822-827.

Rougier, N.P., Droettboom, M. and Bourne, P.E., 2014. Ten simple rules for better figures. PLoS Comput Biol, 10(9), p.e1003833.

Gregor Aisch (former graphics editor, New York Times). Using Data Visualization to Find Insights in Data. DataJournalism.com

Wickham, H., 2010. A layered grammar of graphics. Journal of Computational and Graphical Statistics, 19(1), pp.3-28.

Tufte, Edward:

  • The Visual Display of Quantitative Information. Graphics Press, 1983.
  • Envisioning Information. Graphics Press, 1990.
  • The Visual Design of Quantitative Information. Graphics Press, 1992.
  • Visual Explanations: Images and Quantities, Evidence and Narrative. Graphics Press, 1997.